swev-id: scikit-learn__scikit-learn-26194 | ROC: dtype-aware sentinel threshold for float scores; tests for probability/integer cases #56
Open
rowan-stein wants to merge 2 commits into scikit-learn__scikit-learn-26194 from
Conversation
…fter; keep +1 for integers. Add non-regression tests for probability and integer cases. Ref #53
noa-lucent
requested changes
Dec 26, 2025
noa-lucent
left a comment
Thank you for the fix and the new coverage. I found one blocker: the new ROC threshold tests ended up duplicated later in the file, so the second set overwrites the first and leaves redundant code behind. Please drop the duplicate block so each test is declared exactly once, then we can take another pass.
noa-lucent
approved these changes
Dec 26, 2025
noa-lucent
left a comment
The duplicate ROC threshold tests are gone, and the dtype-aware sentinel change plus the remaining tests look correct. Thanks for the quick cleanup.
Overview
This PR addresses Issue #53.
Problem:
`roc_curve` can return a threshold > 1 when `y_score` contains probability estimates in [0, 1], because it prepends a sentinel threshold computed as `max(y_score) + 1`.
Fix
In `sklearn/metrics/_ranking.py::roc_curve`, replace the prepended threshold logic with dtype-aware handling:
- For floating-point scores, use `np.nextafter(thresholds[0], np.inf, dtype=thresholds.dtype)` to set a value strictly greater than the max score, minimally exceeding the domain when `max == 1.0`.
- For integer scores, keep the existing `max + 1` behavior.
Non-regression tests
Added to `sklearn/metrics/tests/test_ranking.py`:
- `test_roc_curve_thresholds_probabilities_below_one`: when `max(y_score) < 1`, the first threshold is the next representable float above the max and thresholds remain within [0, 1].
- `test_roc_curve_thresholds_probability_one`: when `max(y_score) == 1.0`, the first threshold is `np.nextafter(1.0, +inf)`; mapping remains consistent.
- `test_roc_curve_thresholds_integer_scores`: for integer/non-probability scores, the first threshold is `max(score) + 1` and monotonic decrease holds.
Reproduction steps and observed failure (pre-fix)
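A minimal NumPy-only sketch of the pre-fix behaviour, assuming the sentinel is built as `thresholds[0] + 1` as stated in the Problem section. The helper name `prepend_sentinel_prefix` and the sample scores are hypothetical, and the thresholds are simplified to the distinct scores in decreasing order; this is not the code in `_ranking.py`.

```python
import numpy as np


def prepend_sentinel_prefix(thresholds):
    """Hypothetical stand-in for the pre-fix step: prepend max + 1."""
    return np.r_[thresholds[0] + 1, thresholds]


# Probability-like scores, all inside [0, 1] (illustrative values only).
y_score = np.array([0.1, 0.4, 0.35, 0.8])

# Simplification: distinct scores sorted in decreasing order.
thresholds = np.sort(np.unique(y_score))[::-1]

with_sentinel = prepend_sentinel_prefix(thresholds)

# The first entry leaves the probability domain even though no score does.
print(with_sentinel[0] > 1.0, y_score.max() <= 1.0)
```

The sentinel only needs to be strictly greater than every score, so adding a full unit overshoots for inputs bounded by 1.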
Validation (post-fix)
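A minimal sketch of the dtype-aware sentinel and the properties the new tests describe, assuming only NumPy. The helper `prepend_sentinel` is hypothetical and stands in for the logic inside `roc_curve`; it casts infinity to the array dtype rather than passing `dtype=` to `np.nextafter`, which is intended to have the same effect.

```python
import numpy as np


def prepend_sentinel(thresholds):
    """Hypothetical helper: prepend a dtype-aware sentinel above the max."""
    first = thresholds[0]
    if np.issubdtype(thresholds.dtype, np.floating):
        # Next representable value above the max, in the array's own dtype.
        sentinel = np.nextafter(first, thresholds.dtype.type(np.inf))
    else:
        # Integer scores keep the max + 1 behaviour.
        sentinel = first + 1
    return np.r_[sentinel, thresholds]


# Thresholds are assumed to be sorted in decreasing order already.
below_one = prepend_sentinel(np.array([0.8, 0.4, 0.1]))
exactly_one = prepend_sentinel(np.array([1.0, 0.5, 0.2]))
integers = prepend_sentinel(np.array([5, 3, 1]))

# Max below 1: the sentinel stays inside [0, 1].
assert 0.8 < below_one[0] <= 1.0
# Max equal to 1: the sentinel exceeds 1 by a single float step.
assert exactly_one[0] == np.nextafter(1.0, np.inf)
# Integer scores: max + 1, and the integer dtype is preserved.
assert integers[0] == 6 and np.issubdtype(integers.dtype, np.integer)
# The sentinel never breaks the strictly decreasing order.
for out in (below_one, exactly_one, integers):
    assert np.all(np.diff(out) < 0)
```

Because the sentinel is strictly greater than every score, no sample satisfies `score >= thresholds[0]`.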
Notes
- … `scikit-learn__scikit-learn-26194` and will remain unmerged pending review.
- … `score >= thresholds[i]` remains valid.
- PR created per request; title includes the required token.